Abstract

Let $H$ be a fixed graph with $h$ vertices. The graph removal lemma states that every graph on $n$
vertices with $o(n^h)$ copies of $H$ can be made $H$-free by removing $o(n^2)$ edges. We give a new proof
which avoids Szemerédi's regularity lemma and gives a better bound. This approach also works
to give improved bounds for the directed and multicolored analogues of the graph removal lemma.
This answers questions of Alon and Gowers.

1 Introduction 

Szemerédi's regularity lemma [31] is one of the most powerful tools in graph theory. It was introduced
by Szemerédi in his proof [30] of the Erdős-Turán conjecture on long arithmetic progressions in dense
subsets of the integers. Roughly speaking, it says that every large graph can be partitioned into a small
number of parts such that the bipartite subgraph between almost every pair of parts is random-like.
This structure is useful for approximating the number of copies of some fixed subgraph.

To properly state the regularity lemma requires some terminology. The edge density $d(X,Y)$
between two subsets of vertices of a graph $G$ is the fraction of pairs $(x,y) \in X \times Y$ that are edges
of $G$. A pair $(X,Y)$ of vertex sets is called $\epsilon$-regular if for all $X' \subset X$ and $Y' \subset Y$ with
$|X'| \geq \epsilon|X|$ and $|Y'| \geq \epsilon|Y|$, we have $|d(X',Y') - d(X,Y)| < \epsilon$. A partition
$V = V_1 \cup \ldots \cup V_k$ is called equitable if $\big||V_i| - |V_j|\big| \leq 1$ for all $i$ and $j$.
The regularity lemma states that for each $\epsilon > 0$, there is a positive integer $M(\epsilon)$ such that
the vertices of any graph $G$ can be equitably partitioned $V(G) = V_1 \cup \ldots \cup V_k$
into $k \leq M(\epsilon)$ parts where all but at most $\epsilon k^2$ of the pairs $(V_i,V_j)$ are $\epsilon$-regular.
For more background on the regularity lemma, see the excellent survey by Komlós and Simonovits [19].

In the regularity lemma, $M(\epsilon)$ can be taken to be a tower of twos of height proportional to $\epsilon^{-5}$. On
the other hand, Gowers [12] proved a lower bound on $M(\epsilon)$ which is a tower of twos of height
proportional to $\epsilon^{-1/16}$, thus demonstrating that $M(\epsilon)$ is inherently large as a function of $\epsilon^{-1}$.
Unfortunately, this implies that the bounds obtained by applications of the regularity lemma are usually quite poor.
It remains an important problem to determine if new proofs giving better quantitative estimates for
certain applications of the regularity lemma exist (see, e.g., [H]). One such improvement is the proof
of Gowers [15] of Szemerédi's theorem using Fourier analysis.

The triangle removal lemma of Ruzsa and Szemerédi [26] is one of the most influential applications
of Szemerédi's regularity lemma. It states that any graph on $n$ vertices with $o(n^3)$ triangles can be
made triangle-free by removing $o(n^2)$ edges. It easily implies Roth's theorem [24] on 3-term arithmetic
progressions in dense sets of integers. Furthermore, Solymosi [29] gave an elegant proof that the triangle
removal lemma implies the stronger corners theorem of Ajtai and Szemerédi [1], which states
that any dense subset of the integer grid contains the vertices of an axis-aligned isosceles triangle.

The triangle removal lemma was extended by Erdős, Frankl, and Rödl [9] to the graph removal
lemma. It says that for each $\epsilon > 0$ and graph $H$ on $h$ vertices there is $\delta = \delta(\epsilon, H) > 0$ such that
every graph on $n$ vertices with at most $\delta n^h$ copies of $H$ can be made $H$-free by removing at most
$\epsilon n^2$ edges. The graph removal lemma has many applications in graph theory, additive combinatorics,
discrete geometry, and theoretical computer science.

One well-known application of the graph removal lemma is in property testing. This is an active
area of computer science where one wishes to quickly distinguish between objects that satisfy a
property and objects that are far from satisfying that property. The study of this notion was initiated
by Rubinfeld and Sudan [25], and subsequently Goldreich, Goldwasser, and Ron [11] started
the investigation of property testers for combinatorial objects. One simple consequence of the graph
removal lemma is a constant time algorithm for subgraph testing with one-sided error (see [2] and its
references). A graph on $n$ vertices is $\epsilon$-far from being $H$-free if at least $\epsilon n^2$ edges need to be removed
to make it $H$-free. The graph removal lemma implies that there is an algorithm which runs in time
$O_\epsilon(1)$, accepts all $H$-free graphs, and rejects any graph which is $\epsilon$-far from being $H$-free with
probability at least $2/3$. The algorithm samples $t = 2\delta^{-1}$ $h$-tuples of vertices uniformly at random,
where $\delta$ is picked according to the graph removal lemma, and accepts if none of them form a copy of
$H$, and otherwise rejects. Any $H$-free graph is clearly accepted. If a graph is $\epsilon$-far from being $H$-free,
then it contains at least $\delta n^h$ copies of $H$, and the probability that none of the sampled $h$-tuples forms
a copy of $H$ is at most $(1 - \delta)^t < 1/3$. Notice that the running time as a function of $\epsilon$ depends on the
bound in the graph removal lemma.
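The sampling procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the paper: the graph is assumed to be given as a list of neighbor sets, $H$ as a list of edges on pattern vertices $0, \ldots, h-1$, copies are counted as ordered $h$-tuples, and the names `is_copy_of_H` and `h_free_tester` are hypothetical. The value of `delta` must be supplied by the caller, since it comes from the graph removal lemma.

```python
import math
import random


def is_copy_of_H(adj, h_edges, vertices):
    """True if sending pattern vertex i to vertices[i] maps every edge of H to an edge of G."""
    if len(set(vertices)) < len(vertices):
        return False  # the sampled vertices must be distinct
    return all(vertices[b] in adj[vertices[a]] for a, b in h_edges)


def h_free_tester(adj, h, h_edges, delta, rng=random):
    """One-sided tester: sample t = ceil(2 / delta) ordered h-tuples uniformly at random.

    Returns True (accept) if no sampled tuple forms a copy of H, else False (reject).
    """
    n = len(adj)
    t = math.ceil(2 / delta)
    for _ in range(t):
        sample = [rng.randrange(n) for _ in range(h)]
        if is_copy_of_H(adj, h_edges, sample):
            return False  # a copy of H was found, so the graph is not H-free
    return True  # an H-free graph always reaches this line
```

The number of samples depends only on `delta`, not on $n$, which is why the quality of the bound in the removal lemma controls the running time of the tester.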

Ruzsa and Szemerédi [26] derived the triangle removal lemma in the course of settling an extremal
hypergraph problem asked by Brown, Erdős, and Sós [6]. Let $g_r(n,v,e)$ be the maximum number
of edges an $r$-uniform hypergraph may have if the union of any $e$ edges spans more than $v$ vertices.
Ruzsa and Szemerédi [26] use the triangle removal lemma to settle the $(6,3)$-problem, which states
that $g_3(n,6,3) = o(n^2)$. Equivalently, any triple system on $n$ vertices not containing 6 vertices with 3
or more triples has $o(n^2)$ triples. This was generalized by Erdős, Frankl, and Rödl [S] using the graph
removal lemma to establish $g_r(n, 3r-3, 3) = o(n^2)$.

For most of the applications of the graph removal lemma in number theory, new proofs using Fourier
analysis were discovered which give better bounds (see, e.g., [15], [28]). However, for the applications
which are more combinatorial, no such methods exist. The only known proof of the graph removal
lemma used the regularity lemma, leading to weak bounds for the graph removal lemma and its
applications. Hence, finding a proof which yields better bounds by avoiding the regularity lemma is a
problem of considerable interest and has been reiterated by several authors, including Erdős [8], Alon
[2], Gowers [S], and Tao [33].

Our main result is a new proof of the graph removal lemma which avoids using the regularity 
lemma and gives a better bound. 

Theorem 1. For each graph $H$ on $h$ vertices, if $\delta^{-1}$ is a tower of twos of height $5h^4\log\epsilon^{-1}$, then
every graph $G$ on $n$ vertices with at most $\delta n^h$ copies of $H$ can be made $H$-free by removing $\epsilon n^2$ edges.

For comparison, the regularity proof necessarily gives a bound on $\delta^{-1}$ that is a tower of twos of
height polynomial in $\epsilon^{-1}$.

We next sketch the proof idea of the regularity lemma and our proof of the graph removal lemma.
At each stage of the proof of the regularity lemma, we have a partition $V(G) = V_1 \cup \ldots \cup V_k$ of the
vertex set of a graph $G$ on $n$ vertices into parts which differ in cardinality by at most 1. Let $p_i = |V_i|/n$.
The mean square density with respect to the partition is $\sum_{1 \leq i,j \leq k} p_i p_j d^2(V_i,V_j)$. A refinement of a
partition $\mathcal{P}$ of a set $V$ is another partition $\mathcal{Q}$ of $V$ such that each member of $\mathcal{Q}$ is a subset of some
member of $\mathcal{P}$. If the partition does not satisfy the conclusion of the regularity lemma, then using
the Cauchy-Schwarz defect inequality, the partition can be refined such that the mean square density
increases by $\Omega(\epsilon^5)$ while the number of parts is at most exponential in $k$. This process must stop after
$O(\epsilon^{-5})$ steps as the mean square density cannot be more than 1. We thus get a bound on $M(\epsilon)$ which
is a tower of twos of height $O(\epsilon^{-5})$.

Now we sketch the proof of Theorem 1. Let $H$ be a fixed graph with $h$ vertices. We suppose for
contradiction that $G = (V,E)$ is a graph on $n$ vertices for which $\epsilon n^2$ edges need to be removed to make
it $H$-free and yet $G$ contains less than $\delta n^h$ copies of $H$. We pass to a subgraph $G'$ of $G$ consisting
of the union of a maximum collection of edge-disjoint copies of $H$ in $G$. As the removal of the edges
of $G'$ leaves an $H$-free subgraph of $G$, the graph $G'$ has at least $\epsilon n^2$ edges. Let $d = 2e(G')/n^2 \geq 2\epsilon$.
At each stage of our proof, we have a partition $V = V_1 \cup \ldots \cup V_k$ of the vertex set into parts such
that almost all vertices are in parts of the same size. Let $p_i = |V_i|/n$. The mean entropy density with
respect to the partition is $\sum_{1 \leq i,j \leq k} p_i p_j f(d(V_i,V_j))$, where $f(x) = x\log x$ for $0 < x \leq 1$ and $f(0) = 0$.
A convexity argument shows that the mean entropy density with respect to any partition of $V$ is at
least $d\log d$. The fact that $f(x)$ is nonpositive for $0 \leq x \leq 1$ implies that the mean entropy density
is always nonpositive. We prove a key lemma which shows how to "shatter" sets with few copies of
$H$, and a Jensen defect inequality for such a shattering. These lemmas enable us to show that we
can refine the partition such that the mean entropy density increases by $\Omega(d)$ while the number of
parts only goes up exponentially in c(e, h)k, where c(e, h) = 2^^/'^) ' '.So essentially in each iteration 
the number of parts is one exponential larger. This process must stop after $O(\log d^{-1}) = O(\log \epsilon^{-1})$
steps as the mean entropy density is at least $d\log d$, increases by $\Omega(d)$ at each refinement, and is always
nonpositive. We thus get a bound on $\delta^{-1}$ in the graph removal lemma which is a tower of twos of
height $O(\log \epsilon^{-1})$.
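The two potential functions in these sketches, the mean square density and the mean entropy density, can be computed directly from their definitions. The Python sketch below is illustrative only: the adjacency-dictionary representation, the function names, and the small example graph $K_{3,3}$ are assumptions made for the illustration and do not come from the paper.

```python
import math


def density(adj, X, Y):
    """Edge density d(X, Y): fraction of pairs (x, y) in X x Y that are edges."""
    edges = sum(1 for x in X for y in Y if y in adj[x])
    return edges / (len(X) * len(Y))


def entropy_term(x):
    """f(x) = x log x for x > 0, with f(0) = 0 (natural logarithm)."""
    return x * math.log(x) if x > 0 else 0.0


def mean_density(adj, parts, f):
    """Sum of p_i * p_j * f(d(V_i, V_j)) over all ordered pairs of parts."""
    n = sum(len(part) for part in parts)
    return sum(
        (len(A) / n) * (len(B) / n) * f(density(adj, A, B))
        for A in parts
        for B in parts
    )


# Hypothetical example: the complete bipartite graph K_{3,3} on vertices 0..5.
left, right = [0, 1, 2], [3, 4, 5]
adj = {v: set(right) if v in left else set(left) for v in range(6)}

trivial = [left + right]   # a single part
two_sides = [left, right]  # refinement into the two sides

d = density(adj, left + right, left + right)  # equals 2|E| / n^2
square_trivial = mean_density(adj, trivial, lambda x: x * x)
square_sides = mean_density(adj, two_sides, lambda x: x * x)
entropy_trivial = mean_density(adj, trivial, entropy_term)
entropy_sides = mean_density(adj, two_sides, entropy_term)
```

In this example both quantities increase when the trivial partition is refined into the two sides, and the mean entropy density stays between $d\log d$ and $0$, as the sketch of the proof requires.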

In the next section, we prove a key lemma showing how to "shatter" sets with few copies of $H$
between them. In Section 3, we prove a Jensen defect inequality. We use these lemmas in Section 4 to
prove Theorem 1. In the concluding remarks, we discuss several variants of the graph removal lemma
for which we obtain similar improved bounds, and some open problems. We do not make any serious
attempt to optimize absolute constants in our statements and proofs. All logarithms are assumed to
be base $e$.



2 Key Lemma 

The purpose of this section is to prove a key lemma, Lemma 5, for the proof of Theorem 1. Let $H$ be
a labeled graph with vertex set $[h] := \{1, \ldots, h\}$. Lemma 5 shows that if $V_1, \ldots, V_h$ are vertex subsets
of a graph such that there are few copies of $H$ with the copy of vertex $i$ in $V_i$ for $i \in [h]$, then there
is an edge $(i,j)$ of $H$ such that the pair $(V_i,V_j)$ can be shattered in the following sense. An $(\alpha,c,t)$-shattering
of a pair $(A,B)$ of vertex subsets in a graph $G$ is a pair of partitions $A = A_1 \cup \ldots \cup A_r$ and
$B = B_1 \cup \ldots \cup B_s$ such that $r,s \leq t$ and the sum of $|A_i||B_j|$ over all pairs $(A_i,B_j)$ with $d(A_i,B_j) < \alpha$ is
at least $c|A||B|$. Note that if $\alpha' \geq \alpha$, $c' \leq c$, and $t' \geq t$, then an $(\alpha,c,t)$-shattering for a pair $(A,B)$ is
also an $(\alpha',c',t')$-shattering for $(A,B)$. Before proving the key lemma, we first establish some auxiliary
results on $\epsilon$-regular tuples in uniform hypergraphs.
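To make the definition of an $(\alpha,c,t)$-shattering concrete, here is a small Python check of the three conditions. It is a sketch under assumptions: the graph is stored as a dictionary of neighbor sets, the partitions are lists of vertex lists, and the function name `is_shattering` and the example graph are hypothetical.

```python
def is_shattering(adj, A_parts, B_parts, alpha, c, t):
    """Check whether the partitions A_parts of A and B_parts of B form an (alpha, c, t)-shattering."""
    if len(A_parts) > t or len(B_parts) > t:
        return False  # too many parts
    size_A = sum(len(part) for part in A_parts)
    size_B = sum(len(part) for part in B_parts)
    sparse_weight = 0  # sum of |A_i||B_j| over pairs with d(A_i, B_j) < alpha
    for Ai in A_parts:
        for Bj in B_parts:
            if not Ai or not Bj:
                continue  # empty parts contribute nothing
            edges = sum(1 for a in Ai for b in Bj if b in adj[a])
            if edges < alpha * len(Ai) * len(Bj):
                sparse_weight += len(Ai) * len(Bj)
    return sparse_weight >= c * size_A * size_B


# Hypothetical example: A = {0,1,2,3}, B = {4,5,6,7}, with {0,1} joined completely
# to {4,5} and {2,3} joined completely to {6,7}, and no other edges.
adj = {0: {4, 5}, 1: {4, 5}, 2: {6, 7}, 3: {6, 7},
       4: {0, 1}, 5: {0, 1}, 6: {2, 3}, 7: {2, 3}}
A_parts = [[0, 1], [2, 3]]
B_parts = [[4, 5], [6, 7]]
```

In the example the pair $(A,B)$ has density $1/2$, so the trivial partitions shatter nothing for small $\alpha$, while the two-part partitions expose two empty pairs carrying half of the total weight $|A||B|$.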

2.1 Regular tuples in hypergraphs 

A hypergraph $\Gamma = (V,E)$ consists of a set $V$ of vertices and a set $E$ of edges, which are subsets of
$V$. A hypergraph is $k$-uniform if every edge contains precisely $k$ vertices. A $k$-uniform hypergraph
$\Gamma = (V,E)$ is $k$-partite if there is a partition $V = V_1 \cup \ldots \cup V_k$ such that every edge of $\Gamma$ contains
exactly one vertex from each $V_i$. In a hypergraph $\Gamma$, for vertex subsets $V_1, \ldots, V_k$, let $e(V_1, \ldots, V_k)$
denote the number of $k$-tuples in $V_1 \times \cdots \times V_k$ which are edges of $\Gamma$, and let
$d(V_1, \ldots, V_k) = \frac{e(V_1, \ldots, V_k)}{|V_1| \cdots |V_k|}$,
which is the fraction of $k$-tuples in $V_1 \times \cdots \times V_k$ which are edges of $\Gamma$.

We begin with a simple lemma which follows by an averaging argument. 

Lemma 1. Let $\Gamma$ be a $k$-uniform hypergraph and $A_1, \ldots, A_k$ be nonempty vertex subsets. If $1 \leq a_i \leq |A_i|$
for $i \in [k]$, then there are subsets $B_i, C_i \subset A_i$ each of cardinality $a_i$ such that
$d(B_1, \ldots, B_k) \geq d(A_1, \ldots, A_k) \geq d(C_1, \ldots, C_k)$.

Proof. By averaging, the expected value of $d(X_1, \ldots, X_k)$ with $X_i \subset A_i$ chosen uniformly at random
with $|X_i| = a_i$ is $d(A_1, \ldots, A_k)$. Hence, there are choices of $B_i, C_i \subset A_i$ for each $i \in [k]$ satisfying the
desired properties. □

In a $k$-uniform hypergraph $\Gamma$, a $k$-tuple $(V_1, \ldots, V_k)$ of vertex subsets is $(\alpha,\beta)$-superregular if
$d(U_1, \ldots, U_k) \geq \beta$ holds for all $k$-tuples $(U_1, \ldots, U_k)$ of subsets $U_i \subset V_i$ with $|U_i| \geq \alpha|V_i|$ for $i \in [k]$.

Lemma 2. Suppose $\Gamma$ is a $k$-uniform hypergraph and $A_1, \ldots, A_k$ are vertex subsets each of cardinality
$n$ with $d = d(A_1, \ldots, A_k)$. If $0 < \alpha, \beta < 1/4$ are such that $d \geq 2\beta$ and $(A_1, \ldots, A_k)$ is not
$(\alpha,\beta)$-superregular, then there are $B_i \subset A_i$ for $i \in [k]$ with $|B_1| = \ldots = |B_k| \geq \alpha n$ and
$d(B_1, \ldots, B_k) \geq \left(1 + \frac{\alpha^k}{2}\right)d$.

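Proof. Since $(A_1, \ldots, A_k)$ is not $(\alpha,\beta)$-superregular, there are subsets $A_{i,1} \subset A_i$ such that
$|A_{i,1}| \geq \alpha|A_i|$ and $d(A_{1,1}, \ldots, A_{k,1}) < \beta$. By Lemma 1, we may suppose that $|A_{i,1}| = \lceil \alpha n \rceil$ for $i \in [k]$. Let
$A_{i,2} = A_i \setminus A_{i,1}$, so $|A_{i,j}| \geq \alpha n$ for $i \in [k]$ and $j \in \{1,2\}$.

Summing over all $(j_1, \ldots, j_k) \in \{1,2\}^k$ with $(j_1, \ldots, j_k) \neq (1, \ldots, 1)$, we have

$$\sum |A_{1,j_1}| \cdots |A_{k,j_k}| = |A_1| \cdots |A_k| - |A_{1,1}| \cdots |A_{k,1}|$$

and

$$\sum d(A_{1,j_1}, \ldots, A_{k,j_k})|A_{1,j_1}| \cdots |A_{k,j_k}| = \sum e(A_{1,j_1}, \ldots, A_{k,j_k}) = e(A_1, \ldots, A_k) - e(A_{1,1}, \ldots, A_{k,1})$$

$$= d(A_1, \ldots, A_k)|A_1| \cdots |A_k| - d(A_{1,1}, \ldots, A_{k,1})|A_{1,1}| \cdots |A_{k,1}|$$

$$\geq d|A_1| \cdots |A_k| - \beta|A_{1,1}| \cdots |A_{k,1}|.$$

By averaging, there is $(j_1, \ldots, j_k) \in \{1,2\}^k$ with $(j_1, \ldots, j_k) \neq (1, \ldots, 1)$ such that

$$d(A_{1,j_1}, \ldots, A_{k,j_k}) \geq \frac{d|A_1| \cdots |A_k| - \beta|A_{1,1}| \cdots |A_{k,1}|}{|A_1| \cdots |A_k| - |A_{1,1}| \cdots |A_{k,1}|} = d + (d - \beta)\frac{c}{1-c} \geq d + (d - \beta)\alpha^k,$$

where $c = \frac{|A_{1,1}| \cdots |A_{k,1}|}{|A_1| \cdots |A_k|} \geq \alpha^k$. By Lemma 1, for each $i \in [k]$ there is a subset $B_i$ of $A_{i,j_i}$ of cardinality
$\lceil \alpha n \rceil$ such that $d(B_1, \ldots, B_k) \geq d + (d - \beta)\alpha^k \geq \left(1 + \frac{\alpha^k}{2}\right)d$, where the last inequality uses $d - \beta \geq d/2$. □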

The following lemma is a straightforward generalization of a result of Komlós that dense graphs
contain large superregular pairs.

Lemma 3. Suppose $\Gamma$ is a $k$-uniform hypergraph, and $A_1, \ldots, A_k$ are disjoint vertex subsets each of
cardinality $n$. If $0 < \alpha, \beta < 1/4$ are such that $d(A_1, \ldots, A_k) \geq 2\beta$, then there are subsets $V_i \subset A_i$ for
$i \in [k]$ with $|V_1| = \ldots = |V_k| \geq \alpha^{3\alpha^{-k}\log\beta^{-1}} n$ for which $(V_1, \ldots, V_k)$ is $(\alpha,\beta)$-superregular.

Proof. We repeatedly apply Lemma 2 until we arrive at subsets $V_i \subset A_i$ of the same size for $i \in [k]$
such that $(V_1, \ldots, V_k)$ is $(\alpha,\beta)$-superregular. In each application of Lemma 2 we pass to subsets
each with size at least an $\alpha$-fraction of the size of the original set and the density between them is
at least a factor $\left(1 + \frac{\alpha^k}{2}\right)$ larger than the density between the original sets. After $t$ iterations, the
density between them is at least $\left(1 + \frac{\alpha^k}{2}\right)^t d(A_1, \ldots, A_k) \geq \left(1 + \frac{\alpha^k}{2}\right)^t 2\beta$. This cannot continue for
more than $3\alpha^{-k}\log\beta^{-1}$ iterations since otherwise the density would be larger than 1. Hence, we have
$|V_1| = \cdots = |V_k| \geq \alpha^{3\alpha^{-k}\log\beta^{-1}} n$, which completes the proof. □
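The iteration count in this proof can be checked numerically. The following Python sketch simply multiplies the lower bound $2\beta$ by the factor $1 + \alpha^k/2$ until it exceeds 1 and compares the number of steps with $3\alpha^{-k}\log\beta^{-1}$. The function names are hypothetical, and the check is a sanity test for sample parameters, not a proof.

```python
import math


def steps_until_density_exceeds_one(alpha, beta, k):
    """Number of density increments by the factor (1 + alpha^k / 2), starting from 2*beta,
    needed before the lower bound on the density exceeds 1."""
    factor = 1 + alpha**k / 2
    lower_bound, steps = 2 * beta, 0
    while lower_bound <= 1:
        lower_bound *= factor
        steps += 1
    return steps


def lemma3_iteration_bound(alpha, beta, k):
    """The bound 3 * alpha^(-k) * log(1 / beta) on the number of iterations (natural log)."""
    return 3 * alpha ** (-k) * math.log(1 / beta)
```

For parameters with $0 < \alpha, \beta < 1/4$ the first quantity should never exceed the second, which reflects the inequality $\log(1 + x/2) > x/3$ for $0 < x \leq 1/4$.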
The next lemma allows us to find a large matching of regular $k$-tuples.

Lemma 4. Suppose $\alpha, \beta, c, d > 0$ with $\alpha, \beta < 1/4$ and $d \geq 2\beta$, $\Gamma$ is a $k$-uniform hypergraph, and
$(A_1, \ldots, A_k)$ is a $(c,d)$-superregular $k$-tuple of disjoint vertex subsets each of cardinality $N$. Then
there is a positive integer $r$ such that for each $i \in [k]$ there is a partition $A_i = A_{i,0} \cup A_{i,1} \cup \ldots \cup A_{i,r}$
with $|A_{i,0}| < cN$, and for each $j \in [r]$ the $k$-tuple $(A_{1,j}, \ldots, A_{k,j})$ is $(\alpha,\beta)$-superregular with
$|A_{1,j}| = |A_{2,j}| = \cdots = |A_{k,j}| \geq \alpha^{3\alpha^{-k}\log\beta^{-1}} cN$.

Proof. In the first step, we pick out subsets $A_{i,1} \subset A_i$ for $i \in [k]$ such that the $k$-tuple $(A_{1,1}, \ldots, A_{k,1})$
is $(\alpha,\beta)$-superregular and $|A_{1,1}| = \ldots = |A_{k,1}| \geq \alpha^{3\alpha^{-k}\log\beta^{-1}} N$. We can do this by Lemma 3 since the
$k$-tuple $(A_1, \ldots, A_k)$ is $(c,d)$-superregular and hence $d(A_1, \ldots, A_k) \geq d \geq 2\beta$.

Suppose we have already picked out $A_{i,\ell}$ for $i \in [k]$, $\ell \in [j]$ satisfying that for each $\ell$, $(A_{1,\ell}, \ldots, A_{k,\ell})$
is $(\alpha,\beta)$-superregular, and $|A_{1,\ell}| = \cdots = |A_{k,\ell}| \geq \alpha^{3\alpha^{-k}\log\beta^{-1}} cN$. Let $B_i = A_i \setminus \bigcup_{\ell \in [j]} A_{i,\ell}$, so
$|B_1| = \cdots = |B_k|$. If $|B_1| < cN$, then we let $A_{i,0} = B_i$ for $i \in [k]$ and the proof is complete. Otherwise, we
pick out subsets $A_{i,j+1} \subset B_i$ for $i \in [k]$ satisfying

$$|A_{1,j+1}| = \cdots = |A_{k,j+1}| \geq \alpha^{3\alpha^{-k}\log\beta^{-1}}|B_i| \geq \alpha^{3\alpha^{-k}\log\beta^{-1}} cN$$

and $(A_{1,j+1}, \ldots, A_{k,j+1})$ is $(\alpha,\beta)$-superregular. We can do this by Lemma 3 since $(A_1, \ldots, A_k)$ is
$(c,d)$-superregular, $|B_i| \geq cN = c|A_i|$ for $i \in [k]$, and hence $d(B_1, \ldots, B_k) \geq d \geq 2\beta$. As each $A_{i,j}$
has cardinality at least $\alpha^{3\alpha^{-k}\log\beta^{-1}} cN$, this process terminates in at most
$N / \left(\alpha^{3\alpha^{-k}\log\beta^{-1}} cN\right) = c^{-1}\alpha^{-3\alpha^{-k}\log\beta^{-1}}$ steps, and when this happens, we have the desired partitions. □

2.2 Shattering sets with few copies of H 

The following lemma is the main result of this section and is crucial for the proof of Theorem 1. Before
going into the precise statement and proof, we give a rough sketch. Let $H$ be a graph with vertex set
$[h]$ and suppose $G$ is a graph with disjoint vertex sets $V_1, \ldots, V_h$ of the same size with few copies of
$H$ with the copy of vertex $i$ in $V_i$ for $i \in [h]$. The lemma then says that there is an edge $(i,j)$ of $H$ for
which there is an $(\alpha,c,t)$-shattering of $(V_i,V_j)$, where $c > 0$ depends only on $h$ and $t$ is not too large
as a function of $\alpha$ and $h$.

The proof is by induction on $h$, with the base case $h = 2$ being trivial. Let $H'$ be the induced
subgraph of $H$ with vertex set $[h-1]$. The proof splits into two cases. In the first case, there are large
subsets $V_i' \subset V_i$ with few copies of $H'$ between $V_1', \ldots, V_{h-1}'$ with the copy of vertex $i$ lying in $V_i'$. In
this case, by induction, we can shatter a pair $(V_i',V_j')$ with $(i,j)$ an edge of $H'$ (and hence of $H$), and
this extends to a shattering of $(V_i,V_j)$, completing this case.

In the second case, for all large subsets $V_i' \subset V_i$ there are a substantial number of copies of $H'$
between $V_1', \ldots, V_{h-1}'$ with the copy of $i$ lying in $V_i'$. We create an auxiliary $(h-1)$-partite
$(h-1)$-uniform hypergraph $\Gamma$ with parts $V_1, \ldots, V_{h-1}$ where $(v_1, \ldots, v_{h-1}) \in V_1 \times \ldots \times V_{h-1}$ is an edge of $\Gamma$
if these vertices form a copy of $H'$ in $G$ with vertex $v_i$ the copy of $i$. In this case we can use Lemma 4
to partition each $V_i = V_{i,0} \cup V_{i,1} \cup \ldots \cup V_{i,z}$ with $i \in [h-1]$ such that for each $j \in [z]$ the $(h-1)$-tuple
$(V_{1,j}, \ldots, V_{h-1,j})$ is $(\alpha,\beta)$-superregular in $\Gamma$ with $\beta$ not too small, $|V_{1,j}| = \ldots = |V_{h-1,j}|$ is large, and
$|V_{i,0}|$ not too large. By this superregularity and the definition of $\Gamma$, each vertex $v \in V_h$ which has for
some $j$ at least $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$ for each neighbor $i$ of $h$ in $H$ is a vertex of many copies of $H$
in $G$ with the copy of $i$ in $V_i$. As there are few copies of $H$ with the copy of $i$ in $V_i$ for each $i$, this
implies that for each $j$, there are few vertices in $V_h$ which have at least $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$ for
each neighbor $i$ of $h$. In other words, for most vertices $v \in V_h$ there is a neighbor $i$ of $h$ such that $v$
has less than $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$. We partition $V_h$ where a vertex $v \in V_h$ lies in a certain subset
in this partition depending on for which pairs $(i,j)$ with $i$ a neighbor of $h$ in $H$ and $j \in [z]$ the vertex $v$
has less than $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$. We get that for some neighbor $i$ of $h$ in $H$, this partition of $V_h$
and the partition of $V_i$ form an $(\alpha,c,t)$-shattering of $(V_i,V_h)$.

Lemma 5. Let $0 < \alpha < 1/4$ and $d_h = 2^{-(2/\alpha)^{h^2}}$. Let $H$ be a graph with vertex set $[h]$. Suppose $G$ is
a graph with disjoint vertex subsets $V_1, \ldots, V_h$ each of size $n$ such that the number of copies of $H$ with
the copy of vertex $i$ in $V_i$ for $i \in [h]$ is at most $d_h n^h$. Then there is an edge $(i,j)$ of $H$ for which there
is an $(\alpha, h^{-2}, 2^{d_h^{-1}})$-shattering of the pair $(V_i,V_j)$.

Proof. The proof is by induction on $h$. In the base case $h = 2$, as the number of edges between $V_1$
and $V_2$ is at most $d_2 n^2 < \alpha n^2$, the trivial partitions of $V_1$ and $V_2$ form an $(\alpha,1,1)$-shattering of the
pair $(V_1,V_2)$. Thus the base case holds. The induction hypothesis is that the lemma holds for $h-1$.

Let $H'$ be the induced subgraph of $H$ on vertex set $[h-1]$. Let $\Gamma$ be the $(h-1)$-partite
$(h-1)$-uniform hypergraph on $V_1, \ldots, V_{h-1}$ such that $(v_1, \ldots, v_{h-1}) \in V_1 \times \ldots \times V_{h-1}$ forms an edge of $\Gamma$ if
$(v_i,v_j)$ is adjacent in $G$ whenever $(i,j)$ is an edge of $H'$.

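The proof splits into two cases, depending on whether or not $(V_1, \ldots, V_{h-1})$ is
$\left(1 - \frac{1}{h}, d_{h-1}\right)$-superregular in $\Gamma$.

Case 1: $(V_1, \ldots, V_{h-1})$ is not $\left(1 - \frac{1}{h}, d_{h-1}\right)$-superregular in $\Gamma$. In this case, there are sets $W_i \subset V_i$ for
$i \in [h-1]$ with $|W_i| \geq \left(1 - \frac{1}{h}\right)|V_i|$ and $d(W_1, \ldots, W_{h-1}) < d_{h-1}$. By Lemma 1, letting $n' = \lceil (1 - \frac{1}{h})n \rceil$,
we may suppose further that $|W_1| = \ldots = |W_{h-1}| = n'$. Therefore, the number of copies of $H'$ with
the copy of vertex $i$ in $W_i$ for $i \in [h-1]$ is at most $d_{h-1} n'^{\,h-1}$. By the induction hypothesis, there is an
edge $(i,j)$ of $H'$ (and hence also of $H$) and partitions $W_i = A_1 \cup \ldots \cup A_{r-1}$ and $W_j = B_1 \cup \ldots \cup B_{s-1}$
with $r-1, s-1 \leq 2^{d_{h-1}^{-1}}$ and the sum of $|A_p||B_q|$ over all pairs $(A_p,B_q)$ with $d(A_p,B_q) < \alpha$ is at least

$$(h-1)^{-2}|W_i||W_j| \geq (h-1)^{-2}\left(1 - \frac{1}{h}\right)^2|V_i||V_j| = h^{-2}|V_i||V_j|.$$

Letting $A_r = V_i \setminus W_i$ and $B_s = V_j \setminus W_j$, the partitions $V_i = A_1 \cup \ldots \cup A_r$ and $V_j = B_1 \cup \ldots \cup B_s$
form an $(\alpha, h^{-2}, 2^{d_h^{-1}})$-shattering of the pair $(V_i,V_j)$, which completes this case.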

Case 2: $(V_1, \ldots, V_{h-1})$ is $\left(1 - \frac{1}{h}, d_{h-1}\right)$-superregular in $\Gamma$. In this case, by Lemma 4, there are
partitions $V_i = V_{i,0} \cup V_{i,1} \cup \ldots \cup V_{i,z}$ for $i \in [h-1]$ with $|V_{i,0}| < \left(1 - \frac{1}{h}\right)|V_i| = \left(1 - \frac{1}{h}\right)n$ such that
for each $j \in [z]$ the $(h-1)$-tuple $(V_{1,j}, \ldots, V_{h-1,j})$ is $(\alpha,\beta)$-superregular in $\Gamma$ with $\beta = d_{h-1}/2$, and
$|V_{1,j}| = |V_{2,j}| = \cdots = |V_{h-1,j}| \geq \gamma n$, where





-h 



> d\ 



■h-1 — 




> 2-(2/°) 
Since each $V_{i,j}$ has cardinality at least $\gamma n$ and each $V_i$ has cardinality $n$, we have $z \leq \frac{n}{\gamma n} = \gamma^{-1}$.

Let $I$ denote the set of neighbors of $h$ in $H$. Suppose for contradiction that there is $j \in [z]$ such
that at least $|V_h|/h$ vertices $v \in V_h$ have at least $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$ for all $i \in I$. For $i \in I$, let
$N(v,i)$ denote the set of neighbors of $v$ in $V_{i,j}$, and for $i \in [h-1] \setminus I$, let $N(v,i) = V_{i,j}$. So for at
least $|V_h|/h$ vertices $v \in V_h$, $|N(v,i)| \geq \alpha|V_{i,j}|$ for $i \in [h-1]$. Since the $(h-1)$-tuple $(V_{1,j}, \ldots, V_{h-1,j})$
is $(\alpha,\beta)$-superregular in $\Gamma$, the number of copies of $H$ containing such a fixed $v$ and with the copy of
vertex $i$ in $V_{i,j}$ for $i \in [h-1]$ is at least

$$\beta|N(v,1)| \cdots |N(v,h-1)| \geq \beta\alpha^{h-1}|V_{1,j}| \cdots |V_{h-1,j}| \geq \beta\alpha^{h-1}(\gamma n)^{h-1}.$$

Hence, the number of copies of $H$ with the copy of vertex $i$ in $V_i$ for $i \in [h]$ is at least

$$\frac{|V_h|}{h}\,\beta\alpha^{h-1}(\gamma n)^{h-1} = h^{-1}\beta\alpha^{h-1}\gamma^{h-1}n^h > d_h n^h,$$

contradicting that there are at most $d_h n^h$ copies of $H$ with the copy of vertex $i$ in $V_i$ for $i \in [h]$.

So, for every $j \in [z]$, less than $|V_h|/h$ vertices $v \in V_h$ have at least $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$ for all
$i \in I$. For each subset $S \subset I \times [z]$, let $A_S$ denote the set of vertices $v \in V_h$ with less than $\alpha|V_{i,j}|$
neighbors in $V_{i,j}$ for all $(i,j) \in S$ and at least $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$ for all $(i,j) \in (I \times [z]) \setminus S$.

We have $V_h = \bigcup_{S \subset I \times [z]} A_S$ is a partition of $V_h$ into $2^{|I|z}$ subsets. As for each $j \in [z]$, we have
$|V_{1,j}| = \cdots = |V_{h-1,j}|$ and more than $(1 - 1/h)|V_h|$ vertices in $V_h$ have less than $\alpha|V_{i,j}|$ neighbors in $V_{i,j}$
for some $i \in I$, the sum of $|A_S||V_{i,j}|$ over all $S \subset I \times [z]$ and $i \in I$ for which $d(A_S,V_{i,j}) < \alpha$ is more than
$(1 - 1/h)|V_h||V_{1,j}|$. Summing over all $j \in [z]$, the sum of $|A_S||V_{i,j}|$ over all $S \subset I \times [z]$, $i \in I$, and $j \in [z]$
for which $d(A_S,V_{i,j}) < \alpha$ is at least
$\sum_{j \in [z]} (1 - 1/h)|V_h||V_{1,j}| \geq (1 - 1/h)|V_h|(|V_1|/h) = (1 - 1/h)h^{-1}n^2$.
Hence, there is $i \in I$ such that the sum of $|A_S||V_{i,j}|$ over all $S \subset I \times [z]$, $j \in [z]$ for which $d(A_S,V_{i,j}) < \alpha$
is at least $\frac{1}{|I|}(1 - 1/h)h^{-1}n^2 \geq h^{-2}n^2$. As also $z + 1, 2^{|I|z} \leq 2^{(h-1)z} \leq 2^{d_h^{-1}}$, it follows that the partitions
$V_h = \bigcup_{S \subset I \times [z]} A_S$ and $V_i = \bigcup_{0 \leq j \leq z} V_{i,j}$ form an $(\alpha, h^{-2}, 2^{d_h^{-1}})$-shattering of the pair $(V_i,V_h)$. □

3 A defect inequality for convex functions 

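Jensen's inequality states that if $f$ is a convex function, $\epsilon_1, \ldots, \epsilon_s$ are nonnegative real numbers which
sum to 1, and $x_1, \ldots, x_s$ are real numbers, then

$$\epsilon_1 f(x_1) + \cdots + \epsilon_s f(x_s) \geq f(\epsilon_1 x_1 + \cdots + \epsilon_s x_s).$$

The following lemma is a simple consequence of Jensen's inequality.

Lemma 6. Let $f : \mathbb{R}_{\geq 0} \to \mathbb{R}$ be a convex function, $\epsilon_1, \ldots, \epsilon_s$ and $x_1, \ldots, x_s$ be nonnegative real
numbers with $\sum_{1 \leq i \leq s} \epsilon_i = 1$. For $I \subset [s]$, $c = \sum_{i \in I} \epsilon_i$ with $0 < c < 1$, $u = \sum_{i \in I} \epsilon_i x_i / c$, and
$v = \sum_{i \in [s] \setminus I} \epsilon_i x_i / (1 - c)$, we have

$$\sum_{1 \leq i \leq s} \epsilon_i f(x_i) \geq c f(u) + (1 - c) f(v).$$

Proof. By Jensen's inequality, we have

$$c f(u) = c f\Big(\sum_{i \in I} \frac{\epsilon_i}{c} x_i\Big) \leq \sum_{i \in I} \epsilon_i f(x_i).$$

Since $c = \sum_{i \in I} \epsilon_i$ and $1 = \sum_{1 \leq i \leq s} \epsilon_i$, then $1 - c = \sum_{i \in [s] \setminus I} \epsilon_i$. By Jensen's inequality, we have

$$(1 - c) f(v) = (1 - c) f\Big(\sum_{i \in [s] \setminus I} \frac{\epsilon_i}{1 - c} x_i\Big) \leq \sum_{i \in [s] \setminus I} \epsilon_i f(x_i).$$

From the two previous inequalities, we get

$$c f(u) + (1 - c) f(v) \leq \sum_{1 \leq i \leq s} \epsilon_i f(x_i).$$

□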

Note that equality holds in Jensen's inequality when the numbers $x_i$ are equal. A defect
inequality shows that if these numbers are far from being equal, then Jensen's inequality can be
significantly improved. The following lemma is a defect inequality for a particular convex function
which we will use in the proof of Theorem 1. The lemma assumes that a proportion $c$ of the weight is
distributed amongst numbers which are at most 1/10 of the average.

Lemma 7. Let $f : \mathbb{R}_{\geq 0} \to \mathbb{R}$ be the convex function given by $f(x) = x\log x$ for $x > 0$ and $f(0) = 0$.
Let $\epsilon_1, \ldots, \epsilon_s$ and $x_1, \ldots, x_s$ be nonnegative real numbers with $\sum_{1 \leq i \leq s} \epsilon_i = 1$ and $a = \sum_{1 \leq i \leq s} \epsilon_i x_i$.
Suppose $\beta < 1$ and $I \subset [s]$ is such that $x_i \leq \beta a$ for $i \in I$ and let $c = \sum_{i \in I} \epsilon_i$. Then

$$\sum_{1 \leq i \leq s} \epsilon_i f(x_i) \geq f(a) + (1 - \beta + f(\beta))ca.$$

Proof. Notice that if $a$ or $c$ is 0, the desired inequality is Jensen's inequality. We may therefore assume
$a, c > 0$. We also have $c < 1$ as otherwise $c = 1$, $\epsilon_i = 0$ for $i \in [s] \setminus I$, and
$a = \sum_{1 \leq i \leq s} \epsilon_i x_i = \sum_{i \in I} \epsilon_i x_i \leq \beta a$ as $x_i \leq \beta a$ for $i \in I$, a contradiction.
Let $u = \sum_{i \in I} \epsilon_i x_i / c$, which is a weighted average of the $x_i$ with
$i \in I$, and $v = \sum_{i \in [s] \setminus I} \epsilon_i x_i / (1 - c)$. Let $\delta = u/a$, so $\delta \leq \beta$, and
$\delta' = v/a = (1 - \delta c)/(1 - c) = 1 + \frac{(1 - \delta)c}{1 - c}$.
Note also that $cu = ca\delta$, $(1 - c)v = a(1 - c)\delta'$, and $cu + (1 - c)v = a$. Hence, by Lemma 6, we have

$$\sum_{1 \leq i \leq s} \epsilon_i f(x_i) \geq c f(u) + (1 - c) f(v) = f(a) + ca f(\delta) + a(1 - c) f(\delta')$$

$$\geq f(a) + ca f(\delta) + a(1 - c)\frac{(1 - \delta)c}{1 - c} = f(a) + (f(\delta) + 1 - \delta)ca,$$

where the first equality follows from substituting in $f(x) = x\log x$ for $x > 0$ and $f(0) = 0$, and
the second inequality follows from substituting $x = \delta'$ into the inequality $f(x) \geq x - 1$ for $x \geq 0$. Since
$0 \leq \delta \leq \beta < 1$, and $f(x) + 1 - x$ is a decreasing function on the interval $[0,1]$, we get the desired
inequality. □
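As a sanity check of the defect inequality in Lemma 7, one can compare its two sides on random data. The Python sketch below is an illustration only: the function names are hypothetical, the index set is taken to be the largest admissible one (all $i$ with $x_i \leq \beta a$), and the sampling ranges are arbitrary choices.

```python
import math
import random


def f(x):
    """f(x) = x log x for x > 0, with f(0) = 0 (natural logarithm)."""
    return x * math.log(x) if x > 0 else 0.0


def lemma7_sides(eps, xs, beta):
    """Return (left side, right side) of the inequality in Lemma 7,
    with I chosen as the set of all indices i such that x_i <= beta * a."""
    a = sum(e * x for e, x in zip(eps, xs))
    c = sum(e for e, x in zip(eps, xs) if x <= beta * a)
    left = sum(e * f(x) for e, x in zip(eps, xs))
    right = f(a) + (1 - beta + f(beta)) * c * a
    return left, right


def smallest_gap(trials, s, beta, seed):
    """Smallest value of (left - right) over random weight vectors and points."""
    rng = random.Random(seed)
    gap = float("inf")
    for _ in range(trials):
        weights = [rng.random() + 1e-9 for _ in range(s)]
        total = sum(weights)
        eps = [w / total for w in weights]              # nonnegative, summing to 1
        xs = [rng.choice([0.0, rng.random()]) for _ in range(s)]
        left, right = lemma7_sides(eps, xs, beta)
        gap = min(gap, left - right)
    return gap
```

With $\beta = 1/10$ the coefficient $1 - \beta + f(\beta)$ is roughly $0.67$, which is why the proof of Lemma 8 below can use the constant $1/2$.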
4 Proof of Theorem 1

In this section we prove Theorem 1. Our presentation is chosen to elucidate the similarities and
differences with the well known proof of Szemerédi's regularity lemma.

Let $G = (V,E)$ be a graph. Recall that for vertex subsets $A$ and $B$, $e(A,B)$ denotes the number of
pairs $(a,b) \in A \times B$ that are edges of $G$ and $d(A,B) = \frac{e(A,B)}{|A||B|}$ is the density of the pair $(A,B)$, which
is the fraction of pairs $(a,b) \in A \times B$ that are edges of $G$. For a function $f : \mathbb{R}_{\geq 0} \to \mathbb{R}$ define

$$f(A,B) = \frac{|A||B|}{|V|^2} f(d(A,B)).$$

For partitions $\mathcal{A}$ of $A$ and $\mathcal{B}$ of $B$, let

$$f(\mathcal{A},\mathcal{B}) = \sum_{A' \in \mathcal{A},\, B' \in \mathcal{B}} f(A',B')$$

and $f(\mathcal{A}) = f(\mathcal{A},\mathcal{A})$.



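Lemma 8. Let $f : \mathbb{R}_{\geq 0} \to \mathbb{R}$ be a convex function, $G = (V,E)$ be a graph, and $d = d(V,V) = 2|E|/|V|^2$.

1. For vertex subsets $A, B \subset V$ and partitions $\mathcal{A}$ of $A$ and $\mathcal{B}$ of $B$, we have $f(\mathcal{A},\mathcal{B}) \geq f(A,B)$.

2. If $\mathcal{P}$ is a partition of $V$, then $f(\mathcal{P}) \geq f(d)$.

3. If $\mathcal{P}$ and $\mathcal{P}'$ are partitions of $V$ and $\mathcal{P}'$ is a refinement of $\mathcal{P}$, then $f(\mathcal{P}') \geq f(\mathcal{P})$.

4. Suppose that $A, B$ are vertex subsets with $d(A,B) \geq 10\alpha$, partitions $\mathcal{A}$ of $A$ and $\mathcal{B}$ of $B$ form an
$(\alpha,c,t)$-shattering of $(A,B)$, and $f(x) = x\log x$ for $x > 0$ and $f(0) = 0$. Then

$$f(\mathcal{A},\mathcal{B}) \geq f(A,B) + \frac{c}{2}\frac{e(A,B)}{|V|^2}.$$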

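Proof. We have

$$f(\mathcal{A},\mathcal{B}) = \sum_{A' \in \mathcal{A},\, B' \in \mathcal{B}} f(A',B') = \sum_{A' \in \mathcal{A},\, B' \in \mathcal{B}} \frac{|A'||B'|}{|V|^2} f(d(A',B'))$$

$$= \frac{|A||B|}{|V|^2} \sum_{A' \in \mathcal{A},\, B' \in \mathcal{B}} \frac{|A'||B'|}{|A||B|} f(d(A',B')) \geq \frac{|A||B|}{|V|^2} f(d(A,B)) = f(A,B),$$

where we used $\sum_{A' \in \mathcal{A},\, B' \in \mathcal{B}} \frac{|A'||B'|}{|A||B|} d(A',B') = d(A,B)$ and Jensen's inequality. This establishes part 1. For part 2,
note that if $\mathcal{P}$ is a partition of $V$, then by part 1 we have

$$f(\mathcal{P}) = f(\mathcal{P},\mathcal{P}) \geq f(V,V) = f(d).$$

Part 3 is an immediate corollary of part 1. For part 4, we use Lemma 7 such that for each $A' \in \mathcal{A}$ and
$B' \in \mathcal{B}$ there is an index $i$ with $\epsilon_i = \frac{|A'||B'|}{|A||B|}$ and $x_i = d(A',B')$, with
$a = \sum_i \epsilon_i x_i = d(A,B)$, $\beta = 1/10$, and $I$ the set of $i$ such that $x_i < \alpha \leq \beta a$. Since $\mathcal{A}$ is a partition
of $A$ and $\mathcal{B}$ is a partition of $B$, the sum of all $\epsilon_i$ is 1. By the definition of an $(\alpha,c,t)$-shattering, we
have $\sum_{i \in I} \epsilon_i \geq c$. We conclude that

$$f(\mathcal{A},\mathcal{B}) = \frac{|A||B|}{|V|^2} \sum_i \epsilon_i f(x_i) \geq \frac{|A||B|}{|V|^2}\big(f(a) + (1 - \beta + f(\beta))ca\big) \geq f(A,B) + \frac{c}{2}\frac{e(A,B)}{|V|^2}.$$

□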

The next lemma shows how to refine a partition into not too many parts so that almost all vertices 
are in parts of the same size, and the remaining vertices are in parts of smaller size. 

Lemma 9. Suppose $\mathcal{Q}$ is a partition of a set $V$ of size $n$ into at most $k$ parts and $\nu > 0$. Then there
is a refinement $\mathcal{Q}'$ of $\mathcal{Q}$ into at most $(2\nu^{-1} + 1)k$ parts and a number $r$ such that all parts have size
at most $r$, and all but at most $\nu n$ vertices are in parts of size $r$.

Proof. If $k > \nu n$, then let $r = 1$ and $\mathcal{Q}'$ be the partition of $V$ into parts of size 1. Otherwise, let
$r = \lfloor \nu n / k \rfloor$. Refine each part in $\mathcal{Q}$ into parts of size $r$, with possibly one remaining part of size less
than $r$. The number of parts is at most $n/r + k \leq (2\nu^{-1} + 1)k$. The number of vertices in parts of
size less than $r$ is at most $kr \leq \nu n$. □
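The refinement in the proof of Lemma 9 is constructive, and a direct Python transcription may help in following the counting. This is a sketch under assumptions: parts are lists of vertices, $\nu$ is passed as a `Fraction` to avoid rounding issues in the floor, and the function name `refine_equal_sizes` and the example partition are hypothetical.

```python
import math
from fractions import Fraction


def refine_equal_sizes(parts, nu):
    """Split each part into blocks of size r, leaving at most one smaller block per part.

    Follows the proof of Lemma 9: r = 1 if k > nu * n, and r = floor(nu * n / k) otherwise.
    Returns the refined partition and the common block size r.
    """
    n = sum(len(part) for part in parts)
    k = len(parts)
    r = 1 if k > nu * n else math.floor(nu * n / k)
    refined = []
    for part in parts:
        part = list(part)
        refined.extend(part[i:i + r] for i in range(0, len(part), r))
    return refined, r


# Hypothetical example: 100 vertices split into parts of sizes 50, 40 and 10.
parts = [list(range(0, 50)), list(range(50, 90)), list(range(90, 100))]
nu = Fraction(1, 4)
refined, r = refine_equal_sizes(parts, nu)
small_vertices = sum(len(block) for block in refined if len(block) < r)
```

Each original part contributes at most one block of size less than $r$, so at most $kr \leq \nu n$ vertices end up in small blocks, exactly as in the proof.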

The next lemma allows us to refine a vertex partition of a graph with many edge-disjoint copies of 
H but with relatively few (total) copies of H so that the mean entropy density increases significantly, 
while the number of parts is roughly one exponential larger. 

Lemma 10. Let $H$ be a graph on $h$ vertices. Suppose $G = (V,E)$ is a graph on $n$ vertices, whose edge
set can be partitioned into $\epsilon_0 n^2$ copies of $H$. Let $n_0 \leq \frac{\epsilon_0}{4}n$ be a positive integer and $\mathcal{P}$ be a partition
of $V$ into at most $T$ parts with all parts of size at most $n_0$, and all but at most $\frac{\epsilon_0}{8}n$ vertices in parts
of size $n_0$. Suppose further that $G$ has at most $2^{-(40/\epsilon_0)^{h^2}} T^{-h} n^h$ copies of $H$. Let $f(x) = x\log x$ for
X > and /(O) = 0. Then there is a refinement V' of V with at most parts with s = 2^ ' " , 
such that $f(\mathcal{P}') \geq f(\mathcal{P}) + \frac{\epsilon_0}{4h^2}$ and all but at most $\frac{\epsilon_0}{8}n$ vertices are in parts of equal size, and all other
parts are of smaller size.

Proof. We refine the partition $\mathcal{P}$ as follows. Let $\alpha = \epsilon_0/20$, $c = h^{-2}$, and $t = 2^{d_h^{-1}}$, where
$d_h = 2^{-(2/\alpha)^{h^2}}$ is as in Lemma 5. For every
pair $P_i, P_j \in \mathcal{P}$ of distinct parts each of size $n_0$ for which there is an $(\alpha,c,t)$-shattering of $(P_i,P_j)$, let
$\mathcal{P}_{ij}$ denote the partition of $P_i$ and $\mathcal{P}_{ji}$ denote the partition of $P_j$ in an $(\alpha,c,t)$-shattering of the pair
$(P_i,P_j)$. For each $i$, let $\mathcal{P}_i$ be the partition of $P_i$ which is the common refinement of all partitions $\mathcal{P}_{ij}$,
so $\mathcal{P}_i$ has at most $t^T$ parts. Let $\mathcal{Q}$ be the partition of $V$ consisting of all parts of the partitions $\mathcal{P}_i$. As
each of the at most $T$ parts of $\mathcal{P}$ is refined into at most $t^T$ parts, the number of parts of $\mathcal{Q}$ is at most
$Tt^T$.

Let $G'$ be the subgraph of $G$ obtained by deleting edges which are inside parts of $\mathcal{P}$, contain a
vertex in a part of $\mathcal{P}$ of size not equal to $n_0$, or go between parts of $\mathcal{P}$ with density less than $\epsilon_0/2$.
The number of edges inside parts is at most $n n_0/2 \leq \epsilon_0 n^2/8$. As all but at most $\frac{\epsilon_0}{8}n$ vertices are in
parts of size $n_0$, the number of edges containing a vertex in a part of size not equal to $n_0$ is at most
$\frac{\epsilon_0}{8}n^2$. The number of edges between parts of density less than $\epsilon_0/2$ is at most $(\epsilon_0/2)n^2/2 \leq \epsilon_0 n^2/4$. So
the number of edges of $G$ deleted to obtain $G'$ is at most $\frac{\epsilon_0}{8}n^2 + \frac{\epsilon_0}{8}n^2 + \frac{\epsilon_0}{4}n^2 = \epsilon_0 n^2/2$. Hence,
$G'$ contains at least $\epsilon_0 n^2 - \epsilon_0 n^2/2 = \epsilon_0 n^2/2$ edge-disjoint copies of $H$. Each copy of $H$ in $G'$ has its
vertices in different parts each of size $n_0$, and its edges go between parts with density at least $\epsilon_0/2$. As
every part of $\mathcal{P}$ has size at most $n_0$ and there are at most $T$ parts, $n_0 \geq n/T$. Note that the number of copies
of $H$ in $G$ is at most $2^{-(40/\epsilon_0)^{h^2}} T^{-h} n^h = d_h (n/T)^h \leq d_h n_0^h$. For each copy of $H$ in $G'$, by Lemma 5,
at least one of its edges goes between parts which are $(\alpha,c,t)$-shattered. Hence, the number of edges
of $G$ which are between parts of size $n_0$ with density at least $\frac{\epsilon_0}{2} = 10\alpha$ between them and which are
$(\alpha,c,t)$-shattered is at least the number of edge-disjoint copies of $H$ in $G'$, which is at least $\epsilon_0 n^2/2$.
By Lemma 8, parts 1 and 4, we have

$$f(\mathcal{Q}) \geq f(\mathcal{P}) + \sum \frac{c}{2}\frac{e(P_i,P_j)}{n^2} \geq f(\mathcal{P}) + \frac{c}{2}\cdot\frac{\epsilon_0 n^2/2}{n^2} = f(\mathcal{P}) + c\epsilon_0/4 = f(\mathcal{P}) + \frac{\epsilon_0}{4h^2},$$

where the sum is over all pairs $(P_i,P_j)$ of parts of $\mathcal{P}$ of size $n_0$ with $i < j$ and density at least $\frac{\epsilon_0}{2} = 10\alpha$
between them that are $(\alpha,c,t)$-shattered.

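By Lemma 9 with $\nu = \frac{\epsilon_0}{8}$, there is a refinement $\mathcal{P}'$ of $\mathcal{Q}$ into at most 

$(2\nu^{-1} + 1)|\mathcal{Q}| \leq (16\epsilon_0^{-1} + 1)Tt^T \leq 17\epsilon_0^{-1}Tt^T$ 

parts, such that all but at most $\frac{\epsilon_0}{8}n$ vertices are in parts of equal size, and all other parts are of smaller 
size. By Lemma 8, part 3, we have $f(\mathcal{P}') \geq f(\mathcal{Q}) \geq f(\mathcal{P}) + \frac{\epsilon_0}{4h^2}$, which completes the proof. □ 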
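
The monotonicity under refinement used in the last step can be illustrated numerically. The following is a minimal sketch, assuming the mean entropy density of a partition is the sum over ordered pairs of parts $(A, B)$ of $\frac{|A||B|}{n^2} f(d(A, B))$ with $f(x) = x\log x$; this normalization, the base-2 logarithm, and the toy graph are assumptions made for illustration only.

```python
import math
from itertools import product

def f(x):
    """f(x) = x log x with f(0) = 0 (base-2 logarithm is an assumption here)."""
    return 0.0 if x == 0 else x * math.log2(x)

def density(adj, A, B):
    """Fraction of ordered pairs (a, b) in A x B that are edges."""
    return sum(adj[a][b] for a, b in product(A, B)) / (len(A) * len(B))

def mean_entropy_density(adj, parts):
    """Assumed definition: sum over ordered pairs of parts of |A||B|/n^2 * f(d(A, B))."""
    n = sum(len(p) for p in parts)
    return sum(len(A) * len(B) / n ** 2 * f(density(adj, A, B))
               for A, B in product(parts, repeat=2))

# Toy graph (hypothetical): complete bipartite graph between {0, 1} and {2, 3}.
adj = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 0]]

coarse = mean_entropy_density(adj, [[0, 1, 2, 3]])   # trivial partition
fine = mean_entropy_density(adj, [[0, 1], [2, 3]])   # a refinement of it
```

In this toy case the refinement separates the graph into pairs of density 0 or 1, so the refined value reaches the maximum possible value 0, while the trivial partition gives a negative value. Since $f$ is convex, Jensen's inequality gives the same monotonicity for any refinement.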

We now have the necessary lemmas for the proof of Theorem 1. 

Proof of Theorem 1. Suppose for contradiction that there is a graph $G$ on $n$ vertices with at most 
$\delta n^h$ copies of $H$ and for which $\epsilon n^2$ edges need to be removed from $G$ to make it $H$-free. Let $G'$ be a 
subgraph of $G$ which consists of the union of a maximum collection of edge-disjoint copies of $H$ in $G$. 
As the removal of the edges of $G'$ from $G$ leaves an $H$-free subgraph of $G$, the graph $G'$ has at least 
$\epsilon n^2$ edges. Let $\epsilon_0 n^2$ denote the number of edge-disjoint copies of $H$ in $G'$, so $e(G') = e(H)\epsilon_0 n^2$. 

As $G$ contains at least one and at most $\delta n^h$ copies of $H$, we have $n \geq \delta^{-1/h}$. Let $\mathcal{P}_0$ be an arbitrary 
partition $V = V_1 \cup \ldots \cup V_k$ of the vertex set of $G'$ into parts of size $n_0 = \lceil \frac{\epsilon_0}{8}n \rceil$, except possibly one 
remaining set of size less than $\frac{\epsilon_0}{8}n$. The number $p_0$ of parts of $\mathcal{P}_0$ is at most $8\epsilon_0^{-1} + 1 \leq 5h^2\epsilon^{-1}$. By 
Lemma 8, part 2, we have $f(\mathcal{P}_0) \geq f(d) = d\log d$, where $d = 2e(G')/n^2 \geq 2\epsilon$. We repeatedly apply 
Lemma 10 to obtain a sequence of partition refinements $\mathcal{P}_0, \mathcal{P}_1, \ldots$, and we let $p_i$ denote the number 
of parts of Vi- Once we have the partition Vi, as long as 5 < 2~^'^^^^°^ P^^ i can apply Lemma 
[To] to obtain a refinement Vi+i of Vi. After i iterations, f{Vi) > fiVo) + and pi < s^'-^, where 

8 = 2^^ " . Roughly, at each iteration the number of parts is one exponential larger than in the 
previous iteration. As $\delta^{-1}$ is a tower of twos of height $5h^4\log\epsilon^{-1}$, this process continues for at least 
$i_0 := \lceil 4h^4\log\epsilon^{-1} \rceil$ iterations. Also using the inequalities $h^2\epsilon_0 \geq 2e(H)\epsilon_0 = d$ and $d \geq 2\epsilon$, we have 



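$$f(\mathcal{P}_{i_0}) \geq f(\mathcal{P}_0) + i_0\frac{\epsilon_0}{4h^2} \geq d\log d + (4h^4\log\epsilon^{-1})\frac{\epsilon_0}{4h^2} = d\log d + h^2\epsilon_0\log\epsilon^{-1} \geq d\log(d/\epsilon) > 0,$$

which contradicts that $f$ applied to any partition is nonpositive. □ 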
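
The final two comparisons in this chain use only the inequalities $h^2\epsilon_0 \geq d$ and $d \geq 2\epsilon$ quoted just before it. As a short expansion (our own spelling-out of the step, assuming $0 < \epsilon < 1$ so that $\log\epsilon^{-1} > 0$):

$$d\log d + h^2\epsilon_0\log\epsilon^{-1} \geq d\log d + d\log\epsilon^{-1} = d\log(d/\epsilon) \geq d\log 2 > 0.$$

The first inequality replaces $h^2\epsilon_0$ by the smaller quantity $d$, and the last uses $d/\epsilon \geq 2$ together with $d > 0$.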



5 Concluding remarks 

We gave a new proof of the graph removal lemma with an improved bound. Below we discuss improved 
bounds for several variants of the graph removal lemma and finish with some open problems. 
Removing homomorphisms. There is a seemingly stronger variant of the graph removal lemma 
mentioned in ^ which we refer to as the homomorphism removal lemma. It states that for every 
graph $H$ on $h$ vertices and every $\epsilon > 0$, there is $\delta > 0$ such that if $G$ is a graph on $n$ vertices with at 
most $\delta n^h$ copies of $H$, then $\epsilon n^2$ edges of $G$ can be removed to obtain a graph $G'$ for which there is 
no homomorphism from $H$ to $G'$. It is rather straightforward to obtain this result from Szemerédi's 
regularity lemma. However, one can further show that the $\delta$ in the homomorphism removal lemma is 
closely related to the $\delta$ in the graph removal lemma, and thus Theorem 1 implies a similar improved 
bound in the homomorphism removal lemma. The proof of this fact is quite simple, so we only sketch 
it below. 

Suppose $G$ is a graph on $n$ vertices which has at most $\delta n^h$ copies of $H$. A homomorphic image 
of a graph $H$ is a graph $F$ for which there is a surjective homomorphism from $H$ to $F$. As each 
homomorphic image of $H$ has at most $|H|$ vertices, the number of homomorphic images of $H$ is finite. 
Notice that to remove all homomorphisms from $H$ to $G$, it suffices to remove all copies of homomorphic 
images of $H$ in $G$. If there are few copies in $G$ of each homomorphic image of $H$, then by the graph 
removal lemma we can remove few edges and remove all homomorphisms from $H$ to $G$. So there must 
be a homomorphic image $F$ of $H$ for which there are many copies of $F$ in $G$, say $cn^k$ with c > 6^ ^ , 
where $k$ is the number of vertices of $F$. Let $f$ be a surjective homomorphism from $H$ to $F$, and for 
each vertex $i$ of $F$, let $a_i$ denote the number of vertices of $H$ which map to vertex $i$ in $f$. The blow-up 
$F(a_1, \ldots, a_k)$ of $F$ is the graph obtained from $F$ by replacing each vertex $i$ by an independent set 
$I_i$ of order $a_i$, and a pair of vertices in different parts $I_i$ and $I_j$ are adjacent if and only if $i$ and $j$ 
are adjacent in $F$. Note that $H$ is a subgraph of the blow-up $F(a_1, \ldots, a_k)$. Let $S$ denote the set 
of sequences $(v_1, \ldots, v_k)$ of $k$ vertices of $G$ which form a copy of $F$ with $v_i$ the copy of vertex $i$. If 
$A_1, \ldots, A_k$ are vertex subsets of $G$ with $|A_i| = a_i$ and all $k$-tuples in $A_1 \times \cdots \times A_k$ belong to $S$, then 
these vertex subsets form a copy of $F(a_1, \ldots, a_k)$ in $G$, and hence also make a copy of $H$ in $G$. As 
$G$ has $cn^k$ copies of $F$, a simple convexity argument as in [10] shows that if $c \gg n^{-1/(a_1 \cdots a_k)}$ then 
$S$ contains at least $(1 - o(1))c^{a_1 \cdots a_k}n^{a_1 + \cdots + a_k} = (1 - o(1))c^{a_1 \cdots a_k}n^h$ $k$-tuples of disjoint vertex subsets 
$(A_1, \ldots, A_k)$ with $|A_i| = a_i$ and $A_1 \times \cdots \times A_k \subset S$. Thus, $G$ contains at least 

(1 - o(l))c"i -"'=n^ > (1 - o{l)6^'^'^'/^'>\^ > h\5n^' 
labeled copies of $H$, where we use $a_1 \cdots a_k \leq 3^{h/3}$ as $a_1, \ldots, a_k$ are positive integers which sum to $h$, 
and c> 6^ '' . This contradicts that $G$ has at most $\delta n^h$ copies of $H$. 
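
Two ingredients of this sketch are easy to check on small cases: that $H$ sits inside the blow-up of a homomorphic image, and that a product of positive integers summing to $h$ is at most $3^{h/3}$. The following is a minimal illustration; the function names `blow_up` and `max_product`, the 0-indexed vertex labels, and the choice of $H$ as a 4-cycle mapped onto a single edge are all assumptions made for the example.

```python
from itertools import combinations

def blow_up(k, edges_F, sizes):
    """Return (vertices, edges) of the blow-up F(a_1, ..., a_k).

    Vertex i of F (0-indexed) becomes an independent set I_i of size sizes[i];
    vertices in different sets I_i, I_j are adjacent iff {i, j} is an edge of F.
    """
    parts = [[(i, r) for r in range(sizes[i])] for i in range(k)]
    adjacent_in_F = {frozenset(e) for e in edges_F}
    edges = set()
    for i, j in combinations(range(k), 2):
        if frozenset((i, j)) in adjacent_in_F:
            for u in parts[i]:
                for v in parts[j]:
                    edges.add(frozenset((u, v)))
    vertices = [v for part in parts for v in part]
    return vertices, edges

def max_product(h):
    """Largest product a_1 * ... * a_k over positive integers summing to h."""
    best = [1] * (h + 1)  # best[0] = 1 is the empty product
    for s in range(1, h + 1):
        best[s] = max(a * best[s - a] for a in range(1, s + 1))
    return best[h]

# H = 4-cycle 0-1-2-3-0; x -> x mod 2 is a surjective homomorphism onto F = K_2,
# so a_1 = a_2 = 2 vertices of H map to each vertex of F.
H_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
hom = {x: x % 2 for x in range(4)}
_, B_edges = blow_up(2, [(0, 1)], [2, 2])

# Embed H in the blow-up: vertex x goes to copy number x // 2 inside part hom[x].
embed = {x: (hom[x], x // 2) for x in range(4)}
h_is_subgraph = all(frozenset((embed[u], embed[v])) in B_edges for u, v in H_edges)

# Compare cubes so the check p <= 3^(h/3) stays in exact integer arithmetic.
bound_holds = all(max_product(h) ** 3 <= 3 ** h for h in range(1, 30))
```

Both `h_is_subgraph` and `bound_holds` evaluate to true here; the bound is attained exactly when $h$ is a multiple of 3 and every $a_i$ equals 3.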

Directed, colored, and arithmetic removal lemmas. The directed graph removal lemma, proved 
by Alon and Shapira [3], states that for each directed graph $H$ on $h$ vertices and $\epsilon > 0$ there is 
$\delta = \delta(\epsilon, H) > 0$ such that every directed graph $G = (V, E)$ on $n$ vertices with at most $\delta n^h$ copies 
of $H$ can be made $H$-free by removing at most $\epsilon n^2$ edges. The proof of Theorem 1 can be slightly 
modified to obtain a similar bound as in Theorem 1 for the directed graph removal lemma. The proof 
begins by finding a subgraph G' of G which is the disjoint union of e'n^ copies of H, with e' > 
There is a partition $V = V_1 \cup \ldots \cup V_h$ with at least $h^{-h}\epsilon' n^2$ edge-disjoint copies of $H$ with the copy 
of vertex $i$ in $V_i$. Indeed, in a uniform random partition into $h$ parts, each copy of $H$ has probability 
$h^{-h}$ that its copy of vertex $i$ lies in $V_i$ for all $i \in [h]$. We then let $G''$ be the subgraph of $G'$ which 
consists of the union of these at least 2h^^^'^en? edge-disjoint copies of H. The proof of the directed 
graph removal lemma is then essentially the same as the proof of Theorem 1, except we start with the 
partition $V = V_1 \cup \ldots \cup V_h$ and refine it further at each step. 
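
The averaging step behind the partition $V = V_1 \cup \ldots \cup V_h$ can be checked exhaustively on a small instance. The sketch below is illustrative only: the helper names and the toy data (a single directed edge as $H$, so $h = 2$, inside a host on 5 vertices) are assumptions. It enumerates every assignment of vertices to parts and compares the average number of correctly placed copies with the best one.

```python
from fractions import Fraction
from itertools import product

def aligned_count(copies, part_of):
    """Number of copies (v_1, ..., v_h) whose i-th vertex lies in part i."""
    return sum(all(part_of[v] == i for i, v in enumerate(copy)) for copy in copies)

def average_and_best(n, h, copies):
    """Exact average and maximum of aligned_count over all h**n assignments."""
    total, best = 0, 0
    for part_of in product(range(h), repeat=n):
        c = aligned_count(copies, part_of)
        total += c
        best = max(best, c)
    return Fraction(total, h ** n), best

# Toy data (hypothetical): six edge-disjoint copies of a single directed edge.
copies = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
h = 2
avg, best = average_and_best(5, h, copies)
```

Each copy is placed correctly with probability exactly $h^{-h}$, so the average equals $h^{-h}$ times the number of copies, and some partition must do at least as well as the average.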

There is also a colored graph removal lemma. For each $\epsilon > 0$ and positive integer $h$, there is 
$\delta = \delta(\epsilon, h) > 0$ such that if $\phi : E(H) \to [k]$ is a $k$-edge-coloring of the edges of a graph $H$ on $h$ 
vertices, and $\psi : E(G) \to [k]$ is a $k$-edge-coloring of the edges of a graph $G$ on $n$ vertices such that 
the number of copies of $H$ with coloring $\phi$ in the coloring $\psi$ of $G$ is at most $\delta n^h$, then one can remove 
all copies of $H$ with coloring $\phi$ by deleting at most $\epsilon n^2$ edges of $G$. We can also obtain a similarly 
improved bound on the colored graph removal lemma, and the proof is identical to the proof of the 
directed graph removal lemma. 

Green JH] developed an arithmetic regularity lemma and used it to deduce an arithmetic removal 
lemma. It states that for each $\epsilon > 0$ and integer $m \geq 3$ there is $\delta > 0$ such that if $G$ is an abelian 
group of order $N$, and $A_1, \ldots, A_m$ are subsets of $G$ such that there are at most $\delta N^{m-1}$ solutions to 
the equation $a_1 + \cdots + a_m = 0$ with $a_i \in A_i$ for all $i$, then it is possible to remove at most $\epsilon N$ elements 
from each set $A_i$ so as to obtain sets $A_i'$ for which there are no solutions to $a_1' + \cdots + a_m' = 0$ with 
$a_i' \in A_i'$ for all $i$. Like Szemerédi's regularity lemma, the bound on $\delta^{-1}$ grows as a tower of twos of 
height polynomial in $\epsilon^{-1}$. Green's proof of the arithmetic regularity lemma relies on techniques from 
Fourier analysis and does not extend to nonabelian groups. Král, Serra, and Vena [20] found a new 
proof of Green's removal lemma using the directed graph removal lemma which extends to all groups. 
They proved that for each integer $m \geq 3$ and $\epsilon > 0$ there is $\delta > 0$ such that the following holds. Let 
$G$ be a group of order $N$, $A_1, \ldots, A_m$ be sets of elements of $G$, and $g$ be an arbitrary element of $G$. 
If the equation $x_1 x_2 \cdots x_m = g$ has at most $\delta N^{m-1}$ solutions with $x_i \in A_i$ for all $i$, then there are 
subsets $A_i' \subset A_i$ with $|A_i \setminus A_i'| \leq \epsilon N$ such that there is no solution to $x_1 x_2 \cdots x_m = g$ with $x_i \in A_i'$ for 
all $i$. Their proof relies on the removal lemma for directed cycles, and we thus obtain a new bound 
for this removal lemma as well. 
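
To make the quantities in the arithmetic removal lemma concrete, the following small sketch counts solutions in a cyclic group and then removes one element from each set. The function name `count_solutions` and the toy instance in $\mathbb{Z}_7$ are assumptions chosen for illustration; the brute-force count is only meant for tiny examples.

```python
from itertools import product

def count_solutions(N, sets):
    """Count tuples (a_1, ..., a_m), a_i in sets[i], with a_1 + ... + a_m = 0 in Z_N."""
    return sum(1 for tup in product(*sets) if sum(tup) % N == 0)

# Toy instance (hypothetical): G = Z_7 and m = 3 with A_1 = A_2 = A_3 = {1, 2, 4}.
N = 7
A = [{1, 2, 4}, {1, 2, 4}, {1, 2, 4}]
before = count_solutions(N, A)

# Remove the single element 1 from each set and count again.
trimmed = [s - {1} for s in A]
after = count_solutions(N, trimmed)
```

Here the only solutions are the orderings of $1 + 2 + 4 = 7$, and deleting one element from each set destroys all of them. The lemma asserts that something similar is always possible with at most $\epsilon N$ deletions per set whenever the number of solutions is at most $\delta N^{m-1}$.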

Further directions. Alon [2] showed that the largest possible $\delta(\epsilon, H)$ in the graph removal lemma 
has a polynomial dependency on $\epsilon$ if and only if $H$ is bipartite. For nonbipartite $H$, he showed that 
there is $c = c(H) > 0$ such that $\delta(\epsilon, H) < (\epsilon/c)^{c\log(c/\epsilon)}$. Note that this upper bound is far from the 
lower bound provided by Theorem 1, and it would be extremely interesting to close the gap. Similarly, 
Alon and Shapira [3] determined for which directed graphs $H$ the function $\delta(\epsilon, H)$ in the directed 
graph removal lemma has a polynomial dependency on $\epsilon$. It is precisely when the core of $H$, which 
is the smallest subgraph $K$ of $H$ for which there is a homomorphism from $H$ to $K$, is an oriented 
tree or a directed cycle of length 2. A similar bound also holds for Green's removal lemma. All of 
the superpolynomial lower bounds are based on variants of Behrend's construction [5] giving a large 
subset of the first n positive integers without a three-term arithmetic progression. 

A great deal of research has gone into proving a hypergraph analogue of the removal lemma [16], 
[17], [22], [23], [32], leading to new proofs of Szemerédi's theorem and some of its extensions. Using 
a colored version of the hypergraph removal lemma, Shapira [27] and independently Král, Serra, and 
Vena [21] proved a conjecture of Green establishing a removal lemma for systems of linear equations. 
It would be interesting to find new proofs of these results without using any version of the regularity 
lemma. 